103 research outputs found

    Co-Clustering Network-Constrained Trajectory Data

    Recently, clustering moving object trajectories has been gaining interest from both the data mining and machine learning communities. This problem, however, has been studied mainly and extensively in the setting where moving objects can move freely in Euclidean space. In this paper, we study the problem of clustering trajectories of vehicles whose movement is restricted by the underlying road network. We model relations between these trajectories and road segments as a bipartite graph and we try to cluster its vertices. We demonstrate our approaches on synthetic data and show how they could be useful in inferring knowledge about the flow dynamics and the behavior of the drivers using the road network.
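
    As a rough illustration of the bipartite model mentioned in this abstract, the sketch below links each trajectory to the road segments it visits. The function name, the input layout (trajectory id mapped to a list of segment ids), and the sample ids are assumptions made for the example; the abstract does not specify the clustering step itself, so none is shown.

```python
from collections import defaultdict


def build_bipartite_graph(trajectories):
    """Build both sides of a trajectory/road-segment bipartite graph.

    `trajectories` maps a trajectory id to the list of road-segment ids
    it traverses (a hypothetical input layout chosen for this sketch).
    """
    traj_to_seg = {}
    seg_to_traj = defaultdict(set)
    for traj_id, segments in trajectories.items():
        # One edge per (trajectory, segment) pair; repeated visits collapse.
        traj_to_seg[traj_id] = set(segments)
        for seg_id in segments:
            seg_to_traj[seg_id].add(traj_id)
    return traj_to_seg, dict(seg_to_traj)


# Hypothetical sample data: two trajectories sharing segment "b".
sample = {"t1": ["a", "b"], "t2": ["b", "c", "b"]}
traj_side, seg_side = build_bipartite_graph(sample)
```

    Any co-clustering method would then operate on these two vertex sets and the edges between them.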

    Local and global recoding methods for anonymizing set-valued data

    In this paper, we study the problem of protecting privacy in the publication of set-valued data. Consider a collection of supermarket transactions that contains detailed information about items bought together by individuals. Even after removing all personal characteristics of the buyer, which can serve as links to his identity, the publication of such data is still subject to privacy attacks from adversaries who have partial knowledge about the set. Unlike most previous works, we do not distinguish data as sensitive and non-sensitive, but we consider them both as potential quasi-identifiers and potential sensitive data, depending on the knowledge of the adversary. We define a new version of the k-anonymity guarantee, k^m-anonymity, to limit the effects of the data dimensionality, and we propose efficient algorithms to transform the database. Our anonymization model relies on generalization instead of suppression, which is the most common practice in related works on such data. We develop an algorithm that finds the optimal solution, albeit at a high cost that makes it inapplicable for large, realistic problems. Then, we propose a greedy heuristic, which performs generalizations in an Apriori, level-wise fashion. The heuristic scales much better and in most cases finds a solution close to the optimal. Finally, we investigate the application of techniques that partition the database and perform anonymization locally, aiming at the reduction of the memory consumption and further scalability. A thorough experimental evaluation with real datasets shows that a vertical partitioning approach achieves excellent results in practice. © 2010 Springer-Verlag.
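
    To make the guarantee concrete, here is a minimal sketch of a checker. It assumes the usual reading of k^m-anonymity, which the abstract itself does not spell out: every combination of at most m items that occurs in some transaction must be contained in at least k transactions. The function name and the sample items are illustrative only, and the brute-force enumeration is not the paper's algorithm.

```python
from collections import Counter
from itertools import combinations


def is_km_anonymous(transactions, k, m):
    """Return True if every itemset of size <= m that appears in some
    transaction is contained in at least k transactions.

    Brute-force check, exponential in m; meant only to illustrate
    the definition assumed in the lead-in.
    """
    support = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, m + 1):
            for itemset in combinations(items, size):
                support[itemset] += 1
    return all(count >= k for count in support.values())


# Hypothetical data: "beer" appears once, so an adversary who knows one
# item can single out that transaction.
raw = [{"milk", "bread"}, {"milk", "bread"}, {"milk", "beer"}]
# After generalizing "bread" and "beer" to a common ancestor "food".
generalized = [{"milk", "food"}, {"milk", "food"}, {"milk", "food"}]

raw_ok = is_km_anonymous(raw, k=2, m=1)
generalized_ok = is_km_anonymous(generalized, k=2, m=2)
```

    In this toy case the raw data fails the check while the generalized data passes, which is the effect generalization is meant to achieve.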

    A general framework for searching in distributed data repositories

    This paper proposes a general framework for searching large distributed repositories. Examples of such repositories include sites with music/video content, distributed digital libraries, distributed caching systems, etc. The framework is based on the concept of neighborhood; each client keeps a list of the most beneficial sites according to past experience, which are visited first when the client searches for some particular content. Exploration methods continuously update the neighborhoods in order to follow changes in access patterns. Depending on the application, several variations of search and exploration processes are proposed. Experimental evaluation demonstrates the benefits of the framework in different scenarios.

    Inferring Unusual Crowd Events From Mobile Phone Call Detail Records

    The pervasiveness and availability of mobile phone data offer the opportunity of discovering usable knowledge about crowd behaviors in urban environments. Cities can leverage such knowledge in order to provide better services (e.g., public transport planning, optimized resource allocation) and safer cities. Call Detail Record (CDR) data represents a practical data source to detect and monitor unusual events considering the high level of mobile phone penetration, compared with GPS-equipped and open devices. In this paper, we provide a methodology that is able to detect unusual events from CDR data that typically has low accuracy in terms of space and time resolution. Moreover, we introduce a concept of unusual event that involves a large number of people who exhibit an unusual mobility behavior. Our careful consideration of the issues that come from coarse-grained CDR data ultimately leads to a completely general framework that can detect unusual crowd events from CDR data effectively and efficiently. Through extensive experiments on real-world CDR data for a large city in Africa, we demonstrate that our method can detect unusual events with 16% higher recall and over 10 times higher precision, compared to state-of-the-art methods. We implement a visual analytics prototype system to help end users analyze detected unusual crowd events to best suit different application scenarios. To the best of our knowledge, this is the first work on the detection of unusual events from CDR data with considerations of its temporal and spatial sparseness and distinction between user unusual activities and daily routines. Comment: 18 pages, 6 figures

    De-anonymizable location cloaking for privacy-controlled mobile systems

    The rapid technology upgrades of mobile devices and the popularity of wireless networks significantly drive the emergence and development of Location-based Services (LBSs), thus greatly expanding the business of online services and enriching the user experience. However, the personal location data shared with the service providers also leave hidden risks on location privacy. Location anonymization techniques transform the exact location of a user into a cloaking area by including the locations of multiple users in the exposed area such that the exposed location is indistinguishable from that of the other users. However, in such schemes, location information once perturbed cannot be recovered from the cloaking region and, as a result, users of the location cannot obtain fine-grained information even when they have access to it. In this paper, we propose Dynamic Reversible Cloaking (DRC), a new de-anonymizable location cloaking mechanism that allows the actual location to be restored from the perturbed information through the use of an anonymization key. Extensive experiments using realistic road network traces show that the proposed scheme is efficient, effective and scalable.

    Spatial Cloaking Revisited: Distinguishing Information Leakage from Anonymity

    Location-based services (LBS) are gaining popularity as they provide convenience to mobile users with on-demand information. The use of these services, however, poses privacy issues as the user locations and queries are exposed to untrusted LBSs. Spatial cloaking techniques provide privacy in the form of k-anonymity; i.e., they guarantee that the (location of the) querying user u is indistinguishable from at least k-1 others, where k is a parameter specified by u at query time. To achieve this, they form a group of k users, including u, and forward their minimum bounding rectangle (termed anonymizing spatial region, ASR) to the LBS. The rationale behind sending an ASR instead of the distinct k locations is that exact user positions (querying or not) should not be disclosed to the LBS. This results in large ASRs with considerable dead-space, and leads to unnecessary performance degradation. Additionally, there is no guarantee regarding the amount of location information that is actually revealed to the LBS. In this paper, we introduce the concept of information leakage in spatial cloaking. We provide measures of this leakage, and show how we can trade it for better performance in a tunable manner. The proposed methodology directly applies to centralized and decentralized cloaking models, and is readily deployable on existing systems.
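
    The ASR construction described in this abstract can be sketched as follows. The grouping rule used here (the querying user plus its k-1 nearest neighbours) is an assumption chosen only to keep the example short; it is not claimed to be the paper's method, and it is not by itself a safe anonymization strategy. The function name and coordinates are hypothetical.

```python
def anonymizing_spatial_region(query_user, other_users, k):
    """Return the minimum bounding rectangle (min_x, min_y, max_x, max_y)
    of a group of k users that includes the querying user.

    Grouping by nearest neighbours is an illustrative assumption.
    """
    if k < 1 or len(other_users) < k - 1:
        raise ValueError("need at least k - 1 other users")
    qx, qy = query_user

    def squared_distance(point):
        return (point[0] - qx) ** 2 + (point[1] - qy) ** 2

    # Querying user plus the k - 1 closest other users.
    group = [query_user] + sorted(other_users, key=squared_distance)[: k - 1]
    xs = [x for x, _ in group]
    ys = [y for _, y in group]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical positions; the far-away user at (5, 5) is left out for k = 3.
asr = anonymizing_spatial_region((0, 0), [(1, 1), (5, 5), (-1, 2)], k=3)
```

    Only the rectangle is forwarded to the LBS, so any empty area it covers is the dead-space the abstract refers to.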

    Measuring player’s behaviour change over time in public goods game

    An important issue in the public goods game is whether players' behaviour changes over time, and if so, how significant the change is. In this game, players can be classified into different groups according to the level of their participation in the public good. This problem can be considered as a concept drift problem by asking how much change happens to the clusters of players over a sequence of game rounds. In this study we present a method for measuring changes in clusters with the same items over discrete time points using external clustering validation indices and area under the curve. External clustering indices were originally used to measure the difference between the clusters suggested by clustering algorithms and ground truth labels for items provided by experts. Instead of comparing against ground truth labels, we use these indices to compare clusters at any two consecutive time points, or at the first time point and each of the remaining time points, to measure the difference between clusters over time. In theory, any external clustering index can be used to measure changes for any traditional (non-temporal) clustering algorithm, because a single time point alone does not carry any temporal information. For the public goods game, our results indicate that the players are changing over time but the change is smooth and relatively constant between any two time points.
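
    A minimal sketch of the comparison this abstract describes, using the Rand index as one example of an external clustering index. The abstract does not say which indices are used or exactly how the area under the curve is computed, so the choice of index, the trapezoidal rule with unit spacing, and all names and sample labels below are assumptions.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree
    (both place the pair together, or both place it apart)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("clusterings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


def consecutive_similarity(rounds):
    """Index value between each pair of consecutive time points."""
    return [rand_index(a, b) for a, b in zip(rounds, rounds[1:])]


def area_under_curve(values):
    """Trapezoidal area with unit spacing between time points."""
    return sum((u + v) / 2 for u, v in zip(values, values[1:]))


# Hypothetical cluster labels for four players over three rounds.
rounds = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
similarities = consecutive_similarity(rounds)
auc = area_under_curve(similarities)
```

    Because the index depends only on which items share a cluster, relabelling the clusters between rounds does not affect the result.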

    Who let the DOGS out: Anonymous but Auditable communications using Group Signature schemes with Distributed Opening

    Over the past two decades, group signature schemes have been developed and used to enable authenticated and anonymous peer-to-peer communications. Initial protocols rely on two main authorities, Issuer and Opener, which are given substantial capabilities compared to (regular) participants, such as the ability to arbitrarily identify users. Building efficient, fast, and short group signature schemes has been the focus of a large number of research contributions. However, only a few dealt with the major privacy-preservation challenge of group signatures: providing user anonymity and action traceability without necessarily relying on a central and fully trusted authority. In this paper, we present DOGS, a privacy-preserving Blockchain-supported group signature scheme with a distributed Opening functionality. In DOGS, participants no longer depend on the Opener entity to identify the signer of a potentially fraudulent message; they instead collaborate and perform this auditing process themselves. We provide a high-level description of the DOGS scheme and show that it provides both user anonymity and action traceability. Additionally, we prove that DOGS is secure against message forgery and anonymity attacks.